Ongoing Study for Enhancing Chinese-Spanish Translation with Morphology Strategies
Abstract
Chinese and Spanish have different morphological structures, which poses a major challenge for translation between the two languages. In this paper, we analyze several strategies to better generalize from Chinese, a non-morphology-based language, to Spanish, a rich morphology-based language. The strategies use a first step of Spanish morphology-based simplification and a second step of full-form generation. The latter can be done using a translation system or classification methods. Finally, both steps are combined either by concatenation in cascade or by integration in a factored style. Ongoing experiments (based on the United Nations corpus) and their results are described.
Similar resources
Enhancing scarce-resource language translation through pivot combinations
Chinese and Spanish are among the most spoken languages in the world. However, not much research has been done in machine translation for this language pair. We experiment with the parallel Chinese-Spanish corpus (United Nations) to explore alternative SMT strategies that consist of using a pivot language. Particularly, two well-known alternatives are shown for pivoting: the cascade system and t...
Exploring Spanish-morphology effects on Chinese–Spanish SMT
This paper presents some statistical machine translation results among English, Spanish and Chinese, and focuses on exploring Spanish-morphology effects on the Chinese to Spanish translation task. Although not strictly comparable, it is observed that by reducing Spanish morphology the accuracy achieved in the Chinese to Spanish translation task becomes comparable to the one achieved in the Chin...
Comparative Evaluation of Spanish Segmentation Strategies for Spanish-Chinese Transliteration
This work presents a comparative evaluation among three different Spanish segmentation strategies for Spanish-Chinese transliteration. The transliteration task is implemented by means of Statistical Machine Translation, using Chinese characters and Spanish sub-word segments as the textual units to be translated. Three different Spanish segmentation strategies are evaluated: character-based, syl...
Evaluating Indirect Strategies for Chinese-Spanish Statistical Machine Translation: Extended Abstract
Although Chinese and Spanish are two of the most spoken languages in the world, not much research has been done in machine translation for this language pair. This paper focuses on investigating the state of the art of Chinese-to-Spanish statistical machine translation (SMT), which nowadays is one of the most popular approaches to machine translation. For this purpose, we report details of the...
A Feasibility Study for Chinese-Spanish Statistical Machine Translation
This article presents and describes an experimental prototype system for performing Chinese-to-Spanish and Spanish-to-Chinese machine translation. The system is based on the statistical machine translation (SMT) framework and, more specifically, it implements the bilingual n-gram SMT approach. Since, as far as we know, no large Chinese-Spanish parallel corpus is currently available for training...